METHODS FOR THE IDENTIFICATION OF HIV ANTIVIRAL AGENTS CAPABLE OF
ABROGATING INTEGRASE INTERACTOR PROTEIN BINDING



This application is a continuation-in-part of U.S.
Serial No. 08/248,355, filed May 24, 1994, the contents
of which are incorporated by reference. The invention
disclosed herein was made with Government support under
Grant No. AI24845 from the National Institute of
Allergy and Infectious Disease. Accordingly, the U.S.
Government has certain rights in this invention.

Throughout this application, various references are
referred to by author and year in parentheses.
Disclosures of these publications in their entireties
are hereby incorporated into this application to more
fully describe the state of the art to which this
invention pertains. Full bibliographic citations for
these references may be found at the end of this
application, preceding the claims.

Background of the Invention



In the first few hours after entry into a host cell,
retroviruses direct the reverse transcription of the
RNA genome into DNA, and then the insertion of that DNA
into the host genome to form the integrated provirus
(Goff, 1992; Weiss et al., 1984). The integration
reaction is essential for the successful expression of
the viral DNA to give rise to progeny virus, and is
responsible for the ability of the virus to persist in
the infected cell. The reaction is a highly efficient
and orderly process. Specific inverted repeat
sequences at the termini of the linear viral DNA,
required in cis, are joined to the host DNA. The
reaction is associated with specific alterations at the
junctions: a small number of base pairs, usually two,
are lost from each of the termini of the unintegrated
viral DNA, and a small number of base pairs initially
present only once at the target site are duplicated so
as to flank the integrated provirus.

A single virally encoded enzyme, integrase (IN), is
required for the establishment of the integrated
provirus. This enzyme is encoded by the 3' portion of
the pol gene (Schwartzberg et al., 1984) and is
packaged inside the virion particle in the course of
virion assembly. During the early stages of infection,
the protein remains associated with the viral nucleic
acid in a nucleoprotein complex (Farnet and Haseltine,
1991) and performs several specific reactions: first,
the 3' termini of the viral DNA are cleaved to produce
recessed 3'-OH ends, and second, the two newly generated
3' termini are joined to the 5' phosphates on each
strand of the target sequence in a concerted strand
transfer reaction (Fujiwara and Mizuuchi, 1988). Only
one strand of the viral DNA at each terminus is joined
to each strand of the target DNA. The positions of
attack by each 3'-OH end on the two target DNA strands
are staggered, such that the initial product contains
gaps; host repair enzymes are thought to be responsible
for removing unpaired bases, filling in gaps, and
ligating the second strand. These repair steps result
in the formation of the target site duplication
flanking the provirus.
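
The outcome of the reaction described above (loss of terminal base pairs from the viral DNA and duplication of a short target sequence) can be illustrated with a simple string model. The Python sketch below is illustrative only: the function name `integrate`, the toy sequences, and the default values (2 bp trimmed per terminus, a 5-bp stagger) are assumptions chosen for the example and are not taken from this application; the length of the duplication differs between viruses.

```python
def integrate(target: str, viral: str, site: int,
              trim: int = 2, stagger: int = 5) -> str:
    """Return the top strand of the host DNA after a modelled integration.

    target  -- host DNA, top strand written 5'->3'
    viral   -- linear unintegrated viral DNA, top strand
    site    -- 0-based index in target where the staggered segment begins
    trim    -- base pairs lost from each viral terminus (assumed value)
    stagger -- offset between the two joining positions (assumed value)
    """
    if not 0 <= site <= len(target) - stagger:
        raise ValueError("integration site out of range")
    if len(viral) <= 2 * trim:
        raise ValueError("viral DNA too short to trim")

    # Terminal base pairs are lost from each end of the viral DNA.
    processed = viral[trim:len(viral) - trim]

    # After gap repair, the staggered segment is present on both sides
    # of the provirus, giving the target site duplication.
    left = target[:site + stagger]
    right = target[site:]
    return left + processed + right


host = "AAAAGCTAGTTTT"      # hypothetical host sequence
virus = "AATGTCCCCACATT"    # hypothetical viral DNA
provirus_locus = integrate(host, virus, site=4)
print(provirus_locus)
```

In this toy case the 5-bp segment `GCTAG`, present once in the host sequence, ends up on both sides of the trimmed viral DNA.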



It is possible that some host proteins are directly
involved in promoting the integration reactions
occurring after viral infection. Although recombinant
integrase preparations can carry out all the steps
known to be required for processing and joining the
viral DNA (Bushman and Craigie, 1991; Bushman et al.,
1990; Craigie et al., 1990; Katz et al., 1990), some
aspects of the reaction are not fully recapitulated in
vitro. For example, the isolated proteins show only
very low specific activity for both cutting and joining
of DNA (Bushman et al., 1990; Craigie et al., 1990).
Furthermore, joining reactions carried out with
oligonucleotide substrates for some viruses result in
the transfer of only one 3'-OH to the target DNA,
yielding a Y structure, rather than the concerted
transfer of two 3'-OH termini to the target (Bushman et
al., 1990). These inadequacies of the in vitro systems
may reflect problems with proper oligomerization of the
IN protein, or with the absence of stimulatory
cofactors. For some viruses, host proteins might be
responsible for stimulation of the overall reaction in
vivo, and, especially, for the concerted integration of
the two termini at a single locus.

Integration of retroviral DNA occurs on many
chromosomes and with no apparent local sequence
specificity (Dhar et al., 1980; Hughes et al., 1978;
Shimotohno and Temin, 1980; Shoemaker et al., 1981).
Several studies, however, suggest that there may be
preferred sites for integration. Proviral DNAs
established by infection, rather than by transfection
with cloned DNAs, seem to be more highly and
consistently transcribed, implying that integration
sites are selected from transcriptionally active areas
of the genome (Hwang and Gilboa, 1984). A significant
bias for insertions into open chromatin was detected at
high frequency insertion near DNAse hypersensitive
sites (Rohdewohld et al., 1987; Vijaya et al., 1986)
and into transcriptionally active regions (Scherdin et
al., 1990). In addition, there may be a small number
of "hot spots", or preferred sites, which are
frequently targeted (Shih et al., 1988). Measurements
of the frequency of insertional inactivation into
particular genes have been shown to give fewer events
than predicted, suggesting that there may be "cold
spots" as well (King et al., 1985; Varmus et al.,
1981). In vitro studies of the integration into SV40
minichromosomes showed that the origin region and
linker regions between the nucleosomes tended to
exclude insertions, while nucleosomal regions were
efficiently targeted; phasing of the insertions in the
chromatin could be observed, with a 10-bp periodicity
(Pryciak et al., 1991). These results suggest that the
presence of DNA binding proteins and histones on DNA
can significantly perturb the target choice.



Many of the features of retroviral integration are
similar to those associated with transposition of
eucaryotic and prokaryotic mobile elements. Analogous
studies in various retrotransposon systems also suggest
that target sites for integration are non-random. The
Ty elements in yeast have been shown to exhibit
significant target site biases; Ty1 insertions tend to
cluster near the 5' end of some target genes (Natsoulis
et al., 1989) and within 400 bp of tRNA genes (Ji et
al., 1993), and Ty3 insertions are highly restricted to
specific positions relative to polymerase III promoters
(Chalker and Sandmeyer, 1990; Chalker and Sandmeyer,
1992). In these cases the integration events are not
thought to be affected by the sequence itself or by
transcriptional activity, but rather are more likely to
be profoundly restricted by host chromosomal proteins,
with the potential candidates for the target proteins
being the TFIIIB or TFIIIC transcription factors bound
to the promoter (Sandmeyer et al., 1990).

The identification of host proteins that might target
proviral integration, stimulate integration activity,
or affect the incoming retroviral DNA in other ways
would provide an important lead into new areas of
research. In an attempt to find such proteins, the
yeast two hybrid system has been used (Fields et al.,
U.S. Patent No. 5,283,173) to screen a cDNA library for
proteins that interact with the HIV-1 IN. The search
resulted in the recovery of a single novel gene, termed
ini-1 for integrase interactor 1. The predicted amino
acid sequence of the ini-1 protein shows an unexpected
sequence similarity to SNF5, a yeast transcriptional
activator required for the high-level expression of
many genes (Laurent et al., 1990). The product of the
ini-1 gene may serve as an internal receptor for the
HIV-1 IN, and may be responsible for targeting
integration to active regions of the chromosome.

Summary of the Invention

This invention provides an isolated nucleic acid
encoding an integrase interactor 1 gene (ini-1). The
invention further provides a purified polypeptide
comprising naturally-occurring ini-1. The invention
also provides a purified polypeptide possessing
part or all of the amino acid sequence of human ini-1 as
shown in Figure 4 or any naturally occurring allelic
variant thereof. The invention further provides
methods of determining whether a compound is capable of
interfering with the formation of a complex between a
retrovirus integrase protein and an ini-1 protein.

Finally, the invention provides for a method of
disrupting a retrovirus life cycle in a mammal which
comprises administering to the mammal a compound which
is capable of disrupting a retrovirus integrase
protein-ini-1 protein interaction so as to thereby
disrupt the retrovirus life cycle.

Brief Description of the Figures

Figure 1. Interaction of IN mutants with ini-1. Bars
in the diagram indicate the regions retained in various
GAL4DB-IN mutants tested for their ability to interact
with a GAL4AC-ini-1 fusion in the yeast strain
GGY1::171. Yeast were cotransformed with plasmids
encoding each GAL4DB fusion and GAL4AC-ini-1 and
assayed for the production of β-galactosidase.
Deletions pMAA18-273 (Kalpana and Goff, 1993) were
tested for IN-IN interaction in the context of GAL4AC
fusions along with a GAL4DB-IN fusion. The rest of the
mutants were tested for IN-IN interactions when fused
to either GAL4DB or GAL4AC and against a partner
containing either the same mutant or the wild-type IN;
the indicated result was obtained in all these
settings. Gray bars indicate the GAL4 portion of the
fusion protein; black portions indicate the IN
portion; the blank portion of the bar indicates the
deleted portion. The substitution mutations are
indicated by the residues on top of the relevant bar.
The residues are H = His, C = Cys, D = Asp, E = Glu, V
= Val, N = Asn and S = Ser. The deletion junctions are
indicated by the residue at the junction. '++' = dark
blue; '+' = blue; '-' = white colony phenotype in the
X-Gal assay.

Figure 2A. Northern analysis of human tissues.
Northern blot probed with ini-1 cDNA insert isolated
from pD2.1. Each lane contains about 2 µg of
poly(A)-selected mRNA. Lane 1: peripheral blood lymphocytes;
2: colon; 3: small intestine; 4: ovary; 5: testis; 6:
prostate; 7: thymus; 8: spleen.

Figure 2B. The blot of Figure 2A after stripping and
reprobing with a human actin cDNA probe.

Figure 2C. Northern analysis of human cell lines.
Northern analysis of total RNAs from human cell lines
hybridized with pD2.1 probe. Lane 1: HeLa; lane 2:
CB33; lane 3: Hut78. The amount of RNA loaded in each
lane is not equivalent.

Figure 3. Overlapping cDNA clones encoding ini-1. The
top bar (pD2.1) indicates the cDNA insert isolated from
the yeast screen; pini-1 to 21 are from a ZAPgt11-HeLa
cDNA library and pINI.gt is from a λgt11-HeLa cDNA library.
T7 and T3 indicate the relative positions of the T7 and T3
promoters with respect to the cDNA inserts in the
pBluescript vector.

Figure 4. Sequence of cDNA clone encoding ini-1 (SEQ
ID NO:1). Complete sequence deduced from the
overlapping ini-1 cDNA clones. The A nucleotide of the
first methionine codon was considered nt = 1. Amino acid
residues are numbered on the right side of the diagram
and nucleotides on the left (SEQ ID NO:2). The potential
poly(A) addition signal AATAAA is underlined and the
start and stop codons are highlighted. The poly(A)
stretch in clone pINI.gt is indicated by the stretch of
As in the middle of the 3' non-coding region. Stop codons
are indicated by -***-. The heptad repeat of
leucine/valine residues is highlighted. The potential
N-linked glycosylation sites are circled.
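
Sequence features of the kind annotated in Figure 4 can be located programmatically. The Python sketch below is a minimal illustration and makes assumptions that the legend does not state: potential N-linked glycosylation sites are taken to follow the commonly used N-X-S/T consensus (X being any residue except proline), the function names are hypothetical, and the short input strings are invented toy sequences, not the ini-1 sequence.

```python
import re

POLYA_SIGNAL = "AATAAA"

# Lookahead so that overlapping candidate sites are all reported.
# Assumed consensus: Asn, any residue except Pro, then Ser or Thr.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_polya_signals(nucleotides: str) -> list:
    """Return 1-based start positions of AATAAA in a nucleotide string."""
    seq = nucleotides.upper()
    last_start = len(seq) - len(POLYA_SIGNAL)
    return [i + 1 for i in range(last_start + 1)
            if seq.startswith(POLYA_SIGNAL, i)]


def find_sequons(protein: str) -> list:
    """Return 1-based positions of Asn residues in N-X-S/T motifs."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Toy inputs for illustration only.
print(find_polya_signals("GGAATAAACC"))
print(find_sequons("MNGSANPTNKT"))
```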

Figure 5A. Alignment of ini-1 with SNF5. Schematic
alignment. The blocks of highest similarity are
shaded, and the % identity is given below. The glutamine-
and proline-rich regions of SNF5 are indicated.

Figure 5B. A central portion of the ini-1 amino acid
sequence is shown aligned with that of the yeast SNF5
sequence (SEQ ID NO:3-4). Residues which are identical
between the two sequences are indicated by shading.
The three regions that show a high degree of sequence
similarity between the two proteins (33-50% identity)
are indicated by the bars underneath.

Figure 6A. Interaction of IN with GST-ini-1 in vitro.
Coomassie-stained SDS/PAGE of the recombinant proteins
expressed in bacteria and purified by affinity to
glutathione-agarose beads.

Figure 6B. Interaction of IN with GST-ini-1 in vitro.
The proteins bound to beads were used to specifically
bind recombinant IN from a bacterial lysate, and the
bound proteins were analyzed by Western blot with
IN-specific antibodies. IN: lysate of bacterial cultures
expressing IN; control: control bacterial lysate not
expressing IN. Beads: glutathione beads alone; GST:
GST bound to glutathione beads; GSTIni: GST-ini-1 bound
to glutathione beads. The position of the IN protein
is indicated by the arrow. Molecular weight standards
are indicated on the left.

Figure 6C. Interaction of IN with GST-ini-1 in vitro.
Effect of SDS and detergents on IN-ini-1 interaction.
IN-ini-1 complexes on beads were washed with buffer
containing various concentrations of SDS and NP-40, and
the remaining proteins were analyzed by Western blot
with antibodies to IN. The concentrations of SDS and
NP-40 are indicated above each lane.

Figure 7A. Effect of salt on the interaction of IN
with ini-1. Coomassie-stained gel of the bound
proteins.

Figure 7B. Effect of salt on the interaction of IN
with ini-1. Western analysis of a duplicate gel using
antibodies to IN. Lanes are as in Figure 7A. The various
concentrations of NaCl used in the binding assays are
indicated above the lanes.

Figures 8A-8D. Stimulation of IN joining activity by
GST-Ini1 and mammalian Ini1 extract. All joining
reactions in Figures 8A-8D were carried out as
described in Khavari et al. (1993) and Muchardt et
al. (1993), and contained 15 ng IN per reaction.
(Figure 8A) IN joining reactions carried out with or
without the addition of GST and GST-Ini1F. Lanes 4-6
contained 50, 150, and 450 ng of GST-Ini1F,
respectively; lanes 1 and 2 contained 150 and 450 ng of
GST. (Figure 8B) Effect of mammalian Ini1 extracts on
the IN joining reactions. To isolate the SWI/SNF
complex, a rat liver nuclear extract was prepared
according to Gorski et al. (1986) and fractionated on
a phosphocellulose P11 column (Whatman). The 0.5 M salt
fraction from this column was diluted and loaded onto
a DEAE-52 column (Whatman). A 0.3 M KCl eluate from
this column was further fractionated on an S-300
column (Pharmacia) and the excluded volume containing
Brg1 and Ini1 was collected. Brg1 and Ini1
co-fractionated throughout the purification as determined
by Western analysis using Brg1 and Ini1 antibodies.
Depleted extracts were prepared by passing the Ini1
fraction through a Brg1 affinity column and collecting
the flow-through. Lane 3 contained 1 µl of Ini1
extract, containing approximately 1.5 ng of Ini1
protein as assessed by Western analysis using Ini1
antibody. Lanes 4-6 contained 1, 2, and 4 µl of
depleted extract, respectively. Total protein
concentration in the depleted extract was approximately
half that before depletion. (Figure 8C) Effect of
increasing concentration of target DNA on the
stimulation of joining activities. The target DNA
concentrations used were 10 (lanes 1, 2 and 5), 30
(lanes 3 and 6) and 90 (lanes 4 and 7) ng per 30 µl
reaction, respectively. Lanes 5-7 contained 2 µl of
Ini1 extract, containing approximately 3 ng of Ini1.
(Figure 8D) Effect of increasing concentration of IN
on the activity of nuclear extract. The concentrations
of IN used were 5 ng (lanes 2 and 6), 15 ng (lanes 3
and 7), 45 ng (lanes 4 and 8) and 145 ng (lanes 5 and
9) per 30 µl reaction, respectively. Lanes 6-9
contained 2 µl of Ini1 extract containing approximately
3 ng of Ini1.

Detailed Description of the Invention

This invention provides an isolated nucleic acid
encoding an integrase interactor 1 gene (ini-1). In one
embodiment of this invention, the isolated nucleic acid
is DNA encoding the integrase interactor 1 gene that is
free of one or more introns present in genomic DNA. In
other embodiments of this invention, the isolated
nucleic acid sequence described herein is cDNA or
genomic DNA. The DNA may be labelled with a detectable
moiety selected from a group consisting of a
fluorescent label, a radioactive atom, and a
chemiluminescent label.

In one embodiment of the invention replicable vectors 
which comprise the nucleic acid described herein are 
also provided. The replicable vectors include those 
where the nucleic acid is free of introns. Suitable 
vectors comprise, but are not limited to, a plasmid or 
a virus. 

The DNA sequence described and claimed herein is useful
for the information which it can provide concerning the
amino acid sequence of the polypeptide. The sequence
is useful for generating new cloning and expression
vectors, for transforming and transfecting prokaryotic,
eucaryotic and bacterial host cells, and for new and useful
methods for cultured growth of such host cells capable
of expression of the polypeptide and related products.

The invention further provides a purified polypeptide
comprising naturally-occurring ini-1. The polypeptide
may be the product of prokaryotic or eukaryotic
expression of an exogenous DNA sequence. The exogenous
DNA sequence is a cDNA or a genomic DNA sequence. The
exogenous DNA sequence may be carried on an
autonomously replicating DNA plasmid or viral vector.
In one embodiment the purified polypeptide of ini-1 may
be human ini-1.

The invention also provides a purified
polypeptide possessing part or all of the amino acid
sequence of human ini-1 as shown in Figure 4 or any
naturally occurring allelic variant thereof. The
purified polypeptide may have in vivo or in vitro
biological activity of naturally occurring ini-1. The
purified polypeptide may be covalently associated with
a detectable label substance.

The invention also provides a method of determining
whether a compound is capable of interfering with the
formation of a complex between a retrovirus integrase
protein and an ini-1 protein, which comprises the
following steps:

a) incubating the compound with an appropriate
ini-1 affinity fusion protein and the
retrovirus integrase protein;

b) contacting the incubate of step (a) with an
appropriate affinity medium under conditions
so as to bind the ini-1 affinity protein
complex, if such a complex forms; and

c) measuring the amount of the ini-1 affinity
protein complex formed in step (b) so as to
determine whether the compound is capable of
interfering with the formation of the complex
between the retrovirus integrase protein and
the ini-1 protein.

In one preferred embodiment, the retrovirus integrase
protein may be HIV-1 IN, and the affinity fusion protein
may be GST-ini-1. The affinity medium may be
glutathione-agarose beads. The amount of the affinity
protein complex formed may be determined using
monoclonal or polyclonal antibodies. The above method
may also be performed using a retroviral integrase
protein fusion.

In one preferred embodiment the ini-1 affinity protein
complex or the retrovirus integrase affinity protein
complex is bound to the affinity medium. The ini-1
affinity protein complex or the retrovirus integrase
affinity protein complex is purified and removed from
the affinity medium and the amount of integrase protein
or ini-1 protein is determined. The amount of the
integrase protein or ini-1 protein may be determined
using monoclonal or polyclonal antibodies. The above
assays may be performed in vivo or in vitro.
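
The measuring step of such an assay reduces to comparing the amount of complex recovered with and without the test compound. The Python sketch below shows one way this comparison could be expressed; it is an illustration under stated assumptions, not a procedure specified in this application. The function names, the idea of subtracting a background reading, and the 50% cutoff are all hypothetical choices, and the signal values could be any quantitative readout (for example, band intensities from an antibody-based detection).

```python
def percent_inhibition(with_compound: float, without_compound: float,
                       background: float = 0.0) -> float:
    """Percent reduction in complex signal caused by the test compound.

    with_compound    -- signal for the complex formed with the compound
    without_compound -- signal for the complex formed without it
    background       -- signal from a no-integrase control (assumed)
    """
    control = without_compound - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    treated = max(with_compound - background, 0.0)
    return 100.0 * (1.0 - treated / control)


def interferes(with_compound: float, without_compound: float,
               background: float = 0.0, threshold: float = 50.0) -> bool:
    """Flag a compound when inhibition meets a hypothetical cutoff."""
    inhibition = percent_inhibition(with_compound, without_compound,
                                    background)
    return inhibition >= threshold


# Invented readings for illustration only.
print(percent_inhibition(30.0, 110.0, background=10.0))
print(interferes(100.0, 110.0, background=10.0))
```

Any real cutoff would have to be chosen from replicate measurements and appropriate controls.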

The invention also provides for a method of disrupting
a retrovirus life cycle in a cell which comprises
contacting the cell with a compound which is capable of
disrupting a retrovirus integrase protein-ini-1 protein
interaction so as to thereby disrupt the retrovirus
life cycle. The compound contacting the cell may be a
soluble ini-1 fragment, an HIV-1 IN fragment or a
chemical molecule. The soluble ini-1 fragment may be
a small peptide of 4 to 20 amino acids in length; in
one preferred embodiment, the peptide is 6 to 12 amino
acids in length. Other fragments may include non-peptide
mimics of ini-1 fragments.

The invention also provides a method of disrupting a
retrovirus life cycle in a mammal which comprises
administering to the mammal a compound which is capable
of disrupting a retrovirus integrase protein-ini-1
protein interaction so as to thereby disrupt the
retrovirus life cycle. The compound administered to the
mammal may be a soluble ini-1 fragment, an HIV-1 IN
fragment or a chemical molecule.

The invention provides an isolated cDNA encoding an
integrase interactor 1 gene (ini-1) having a coding
sequence substantially the same as the coding sequence
shown in Figure 4.

For the above-identified compounds and methods the
retrovirus may be selected from the following groups:
Avian leukosis-sarcoma, Mammalian C-type, B-type viruses,
D-type viruses, HTLV-BLV group, Lentiviruses and Foamy
viruses. The retroviruses may also be selected from
the following examples: Rous sarcoma virus (RSV), Avian
myeloblastosis virus (AMV), Avian erythroblastosis
virus (AEV), Rous-associated virus (RAV)-1 to 50,
RAV-0, Moloney murine leukemia virus (Mo-MLV), Harvey
murine sarcoma virus (Ha-MSV), Abelson murine leukemia
virus (A-MuLV), AKR-MuLV, Feline leukemia virus (FeLV),
Simian sarcoma virus, endogenous and exogenous viruses
in mammals, Reticuloendotheliosis virus (REV), spleen
necrosis virus (SNV), Mouse mammary tumor virus (MMTV),
Mason-Pfizer monkey virus (MPMV), "SAIDS" viruses,
Human T-cell leukemia (or lymphotropic) virus (HTLV),
Bovine leukemia virus (BLV), Human immunodeficiency
virus (HIV-1 and -2), Simian immunodeficiency virus
(SIV), Feline immunodeficiency virus (FIV), Visna/Maedi
virus, Equine infectious anemia virus (EIAV), Caprine
arthritis-encephalitis virus (CAEV), Progressive
pneumonia virus, and many human and primate isolates, e.g.,
simian foamy virus (SFV).

This invention is also directed to pharmaceutical
compositions comprising therapeutically effective
amounts of compounds of the invention together with
suitable diluents, preservatives, solubilizers,
emulsifiers and adjuvants. A therapeutically effective
amount refers to that amount which provides a
therapeutic effect for a given condition
and administration regime. Such compositions are
liquids or lyophilized or otherwise dried formulations
and include diluents of various buffer content (e.g.,
Tris-HCl, acetate, phosphate), pH and ionic strength,
additives such as albumin or gelatin to prevent
adsorption to surfaces, detergents (e.g., Tween 20,
Tween 80, Pluronic F68, bile acid salts), solubilizing
agents (e.g., glycerol, polyethylene glycol),
anti-oxidants (e.g., ascorbic acid, sodium metabisulfite),
preservatives (e.g., Thimerosal, benzyl alcohol,
parabens), bulking substances or tonicity modifiers
(e.g., lactose, mannitol), complexation with metal
ions, or incorporation of the material into or onto
particulate preparations of polymeric compounds such as
polylactic acid, polyglycolic acid, hydrogels, etc. or
into liposomes, microemulsions, micelles, unilamellar
or multilamellar vesicles, erythrocyte ghosts, or
spheroplasts. Such compositions will influence the
physical state, solubility, stability, or rate of in vivo
release. Controlled or sustained release compositions
include formulation in lipophilic depots (e.g., fatty
acids, waxes, oils). Also included in this invention
are particulate compositions coated with polymers
(e.g., poloxamers or poloxamines). Other embodiments of
the compositions of the invention incorporate
particulate forms, protective coatings and permeation
enhancers for various routes of administration,
including parenteral, pulmonary, nasal and oral.

The following examples are offered to more fully
illustrate the invention, but are not to be construed
to limit the scope thereof.

Isolation of cDNAs encoding proteins that interact with
HIV-1 IN

To identify human proteins that bind to the HIV-1
integrase, the yeast two hybrid system was used to
screen a large library of human cDNAs. In this system,
the expression of two constructs in yeast, one encoding
the GAL4 DNA binding domain (GAL4DB) fused to one
protein, and the other encoding the GAL4 activator
domain (GAL4AC) fused to another protein, results in
the reconstitution of GAL4 function if the two proteins
bind with sufficient affinity (Fields and Song, 1989).
The appearance of GAL4 function can be detected by
monitoring the expression of an integrated reporter
gene, such as lacZ, downstream of a GAL4-dependent
promoter. Previously the system was used to detect
several interactions between viral and host proteins
(Luban et al., 1992; Luban et al., 1993), and in
particular to detect IN-IN multimerization (Kalpana and
Goff, 1993).

To generate a library of GAL4 activator domain-cDNA
fusions, the inserted sequences of a human cDNA library
derived from the HL60 macrophage/monocytic cell line
were excised from the original phage vector and
transferred in bulk to a plasmid vector. Six different
pools of plasmids were prepared, each containing
100,000 to 500,000 individual clones (Table 1).

Table 1

Library     Number of original     IN-interacting
pools       E. coli clones         clones recovered
---------   -------------------    ----------------
Pool I      0.23 x 10^5
Pool II     0.5 x 10^5             One
Pool III    5.00 x 10^5
Pool IV     3.00 x 10^5
Pool V      1.5 x 10^5
Pool VI     1.00 x 10^5            Two

Table 1 is a summary of recombinant clones in various
pools of the HL60 cDNA library and of the positive
IN-interacting clones obtained from each pool in the two
hybrid screen.

Yeast strain GGY1::171, which contains an integrated
reporter gene, was transformed with a mixture of a given
DNA pool and an equal amount of pGAL4DB-IN DNA, encoding
a fusion protein consisting of the GAL4 DNA binding
domain and the entire HIV-1 IN (Kalpana and Goff, 1993).
Cotransformants were recovered after selection for
markers on both plasmid vectors, and colonies were
replicated to filters and stained with X-gal. Ten blue
colonies were obtained from a total of 600,000
transformants screened (Table 1). The plasmids were
rescued from these colonies and retested by
transformation along with the plasmid encoding GAL4DB-IN
into GGY1::171. Three of these candidate clones
consistently tested positive upon cotransformation: one
from pool II and two from pool VI. Subsequent analysis
of these clones (see below) showed that all three
contained identical cDNA inserts. Thus, a single cDNA
was identified in this large-scale screen as encoding a
protein able to interact with the HIV-1 IN. The novel
gene was termed ini-1 for integrase interactor 1.

Specificity of the interaction between novel sequences
and HIV-1 IN

Many cDNAs initially isolated as candidates in the
two-hybrid system do not in fact depend on interaction with
the partner hybrid protein, but instead activate
expression of the indicator gene through other means
(Luban et al., 1993). To demonstrate a requirement of
the partner for interaction, the GAL4AC-ini-1 fusions
were tested for activation in several settings (Table 2).
Transformation of GGY1::171 by the GAL4AC-ini-1
plasmids alone, without pGAL4DB-IN, did not activate
lacZ expression. Cotransformation with a plasmid
encoding GAL4DB alone also did not activate, suggesting
that the ini-1 protein did not interact directly with
the GAL4 DNA binding domain. To confirm that the
activation was not restricted to the GAL4 system, the
DNAs were introduced into strain CTY10-5d, containing
an integrated GAL1-lacZ fusion downstream of the lexA
operator, along with a plasmid encoding a LexA-IN
fusion, or as control, LexA alone. LacZ expression was
detected only when the GAL4AC-ini-1 protein was present
with LexA-IN fusions and not with the LexA protein
alone. These results indicate that activation by ini-1
fusions was not dependent on the particular operator
and binding domain used to tether the IN protein to
DNA.

To determine whether the ini-1 protein could interact
with unrelated fusion proteins, the three ini-1
plasmids were introduced into the appropriate indicator
strain along with control plasmids encoding a
GAL4DB-Moloney gag fusion or a LexA-Lamin fusion protein. No
lacZ expression was detected in either of these
settings, indicating that activation by the cDNA
fusions was specific for the HIV-1 IN protein (Table
1). Thus, in contrast to other screens for interacting
partners with other proteins, where many RNA-binding
proteins were initially detected, there were no false
positive clones recovered with IN. The results suggest
that the original GAL4-IN construct was not prone to
interaction with false positives, but bound uniquely to
a human protein encoded by a single cDNA.
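
The control experiments described above amount to a simple decision rule: a candidate is treated as a specific interactor only if the reporter is activated with the IN fusion and with none of the control partners. The Python sketch below encodes that rule. The function name and the dictionary layout are hypothetical conveniences; the True/False values simply restate the qualitative outcomes reported in the text for the GAL4AC-ini-1 fusion and add no new data.

```python
def is_specific_interactor(activation: dict, bait: str,
                           controls: list) -> bool:
    """True if the reporter is on with the bait and off with every control.

    activation -- maps the partner construct to reporter activation
    bait       -- the construct whose interaction is being tested
    controls   -- partner constructs that should not activate
    """
    if not activation[bait]:
        return False
    return not any(activation[name] for name in controls)


# Outcomes as described in the text for GAL4AC-ini-1.
gal4_system = {
    "GAL4DB-IN": True,
    "no partner": False,
    "GAL4DB alone": False,
    "GAL4DB-Moloney gag": False,
}
lexa_system = {
    "LexA-IN": True,
    "LexA alone": False,
    "LexA-Lamin": False,
}

print(is_specific_interactor(
    gal4_system, "GAL4DB-IN",
    ["no partner", "GAL4DB alone", "GAL4DB-Moloney gag"]))
print(is_specific_interactor(
    lexa_system, "LexA-IN", ["LexA alone", "LexA-Lamin"]))
```

A candidate that also activated the reporter with any control partner would be rejected by the same rule, which is how partner-independent activators are excluded.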






The central domain of IN is required for interaction
with ini-1

The two hybrid system has been previously used to
define the minimal region of IN required for IN-IN
interactions, finding that the central core region of
the protein was necessary for multimerization (Kalpana
and Goff, 1993). To determine the region of IN
required for binding to ini-1, a panel of pGAL4DB-IN
mutants containing deletions and point mutations was
tested for activation in the presence of GAL4AC-ini-1.
Mutants lacking either the N-terminal domain of IN,
containing a putative zinc finger region, or the
C-terminal domain, retained their ability to bind to
ini-1 (Figure 1). Two larger C-terminal deletions,
removing part of the central core and eliminating IN-IN
interactions, did not affect IN-ini-1 interaction. In
addition, a variant containing a point mutation in the
C-terminal region of IN that blocked IN-IN interaction
(Kalpana and Goff, unpublished data) was still positive
for IN-ini-1 interaction. Thus, the IN-ini-1
interaction requires less of the IN central and
C-terminal domains than the IN-IN interaction. Two
mutants of IN with point mutations in the N-terminal
zinc finger region were also tested. While these
mutants still carry out IN-IN interactions, they were
both defective for the IN-ini-1 interaction. Thus,
binding to ini-1 seems to require the N-terminal zinc
finger region of IN. While the two interaction
domains (that for IN-IN dimerization and that for
IN-ini-1 interaction) may overlap, the IN-ini-1 domain
seems to be more N-terminal.



Expression of the ini-1 mRNA in mammalian cells



The cDNA inserts recovered in the GAL4AC plasmids were
derived from mRNAs of the HL60 human monocytic-
myelocytic cell line, suggesting that the gene must be
expressed at least at moderate levels in this tumor
line. The sequences present in the cDNA insert might
include only a portion of the complete mRNA. To
determine how widely the ini-1 mRNA was expressed, and
to determine the size of the full-length transcript,
RNAs were isolated from HeLa cells, a human B-cell
tumor line (CB33), and a human T-cell line (Hut78), and
analyzed by Northern blot hybridization using an ini-1
probe (Figure 2). RNAs from all three lines contained
a single major species detected with the probe,
migrating at approximately 2.0 kb. In addition, the
HeLa and CB33 lines contained a minor species migrating
at approximately 4.0 kb. To determine whether the
ini-1 gene was expressed in normal tissues, RNAs
isolated from peripheral blood lymphocytes, colon,
small intestine, ovary, testis, prostate, thymus and
spleen were separated by electrophoresis, blotted and
probed as before (Figure 2). All 8 tissues expressed
substantial levels of the 2.0 kb mRNA. The level of
expression of the mRNA was similar in all the tissues
tested. In addition to the major mRNA species, long
exposures of the autoradiographs revealed low levels of
a species migrating at 1.25 kb present in the spleen,
and similarly low levels of a species migrating at
about 4 kb in the thymus, prostate and testis. These
results suggest that the gene is very widely, and
possibly ubiquitously, expressed, and that the major
transcript in all tissues is approximately 2.0 kb in
length. Additional transcripts with alternative
structures, or transcripts from closely related genes,
may be present in some tissues.

Isolation of cDNAs spanning the complete ini-1 coding
region and predicted sequence of the ini-1 protein

The cDNA inserts in the three GAL4AC plasmids recovered
were examined by restriction mapping and partial
sequence analysis, and all were found to consist of the
identical 1.0 kb fragment, presumably from sibling
clones in the original phage library. To isolate
longer cDNAs, this fragment was excised from the
plasmid and used as a probe to screen two phage cDNA
libraries of HeLa cell mRNA, one made in the λZapII
vector and one in λgt11. Twenty clones were recovered
from approximately 500,000 clones of the λZapII
library, and the six largest inserts were excised from
the vector. Four of these had overlapping restriction
maps (Figure 3) consistent with that of the probe DNA
and were subjected to sequence analysis. Twelve clones
were recovered from the λgt11 library, but no inserts
were larger than the earlier clones; one of these cDNAs
was also sequenced. The DNA sequences obtained could
be readily aligned, and spanned 1.85 kb, nearly the
size of the full-length mRNA detected by Northern blots
(Figure 3).






The DNA sequence from the clones contained several
unusual features (Figure 4; SEQ ID NO:1). First, the
sequence was extraordinarily GC-rich and included
several long stretches of pure GC runs. These features
made determination of the sequence by dideoxynucleotide
methods difficult, and several regions could only be
read from smaller subclones that presumably removed
secondary structures from the DNA templates. The
sequence revealed a single long open reading frame of
385 codons, curiously beginning with a tandem array of
four consecutive ATG codons. The first ATG of the
array lies in a good match to the consensus sequence
for translational initiation (Kozak, 1991). These
codons are likely to represent the true start sites for
translation, since termination codons are found
upstream of these ATGs. The significance of the
presence of these tandem methionine codons remains
unclear. The one clone from the λgt11 library
(pINI.gt) contained a stretch of poly(A) residues at
the 3' junction adjacent to the vector, and three of
the clones from the λZapII library had 3' junctions at
or upstream of this position, such that they could have
been derived from a similar mRNA. Examination of the
sequence upstream of the poly(A) stretch revealed the
presence of a perfect consensus polyadenylation signal,
AATAAA, at -25 bp relative to the poly(A). These
results suggest that most of the ini-1 mRNAs are
processed by cleavage and polyadenylation at this
position. One cDNA clone (pINI.21), however, extended
beyond this region without poly(A) sequences. This
clone suggests that some mRNAs are of extended length
and arise through use of alternative poly(A) addition
sites further downstream. These RNAs could possibly
account for the longer mRNAs observed in Northern blots
of mRNAs from various tissues. One clone (pINI.9)
lacked a short stretch of 27 bp (nt 206-232) near the
5' end of the coding region. This clone might have
arisen from an alternatively spliced mRNA lacking an
internal exon.
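
The two sequence checks described above, locating a consensus AATAAA signal a short distance upstream of a 3' poly(A) stretch and confirming that a 27 bp deletion leaves the reading frame intact, can be sketched in Python. This is only an illustrative sketch: the function names, the `min_tail` threshold and the toy sequence in the usage note are hypothetical and are not taken from SEQ ID NO:1.

```python
def find_polya_signals(seq, signal="AATAAA", min_tail=10):
    """Return (signal_index, distance_to_tail) for each signal found
    upstream of a 3' poly(A) tail.

    Assumption: the tail is the run of A residues at the very 3' end,
    and it must be at least `min_tail` residues long to count.
    """
    seq = seq.upper()
    tail_len = len(seq) - len(seq.rstrip("A"))
    if tail_len < min_tail:
        return []
    tail_start = len(seq) - tail_len
    hits = []
    pos = seq.find(signal)
    # Only accept signals that end before the tail begins.
    while pos != -1 and pos + len(signal) <= tail_start:
        hits.append((pos, tail_start - pos))
        pos = seq.find(signal, pos + 1)
    return hits


def deletion_in_frame(first_nt, last_nt):
    """Return (in_frame, codons_removed) for a deletion spanning the
    inclusive nucleotide range first_nt..last_nt."""
    length = last_nt - first_nt + 1
    return (length % 3 == 0, length // 3)
```

For a toy sequence such as "GCGC" + "AATAAA" + "GCTGCTGCTGCTGCTGCTC" + 20 A residues, the signal is reported 25 nt upstream of the tail. Applied to the nt 206-232 deletion, the second function reports an in-frame loss of 9 codons, which fits the suggestion that the clone lacks a small internal exon.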

The long open reading frame predicts the formation of
a protein of 44,131 daltons containing 385 amino acids.
The sequence revealed the presence of a heptad repeat
of three leucine residues near the amino terminus of
the encoded protein; these residues could potentially
form a leucine zipper structure. While these sequences
might be important for multimerization, interactions
with other proteins, or for the normal function of
ini-1, these structures can be eliminated as important
for interaction with the IN protein since they are not
present in the original yeast plasmid clone that
demonstrated binding to IN. The predicted sequence
includes no amino-terminal secretion signals, no
transmembrane segment, and no strikingly acidic or
basic regions. There are three potential sites for
addition of N-linked sugars (Figure 4). The predicted
pI of the protein is 6.15.
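
Potential N-linked glycosylation sites of the kind mentioned above are conventionally recognized by the sequon N-X-S/T, where X is any residue except proline. The following Python sketch shows how such sites, and the mean residue mass implied by the predicted molecular weight, could be computed. The function names are hypothetical and the test peptide in the usage note is invented for illustration; it is not part of the ini-1 sequence.

```python
import re

# N-X-S/T with X != P. The lookahead keeps overlapping sequons
# (for example "NNTS") from being skipped.
SEQUON = re.compile(r"N(?=[^P][ST])")


def n_linked_sites(protein):
    """Return 1-based positions of asparagines that begin an
    N-X-S/T sequon (X is any residue other than proline)."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


def mean_residue_mass(total_daltons, n_residues):
    """Average mass per residue, a quick sanity check on a predicted
    molecular weight."""
    return total_daltons / n_residues
```

For the invented peptide "MNGSANPTQNNTS" the function reports sites at positions 2, 10 and 11 and skips the asparagine followed by proline. For the predicted protein, 44,131 daltons over 385 residues gives a mean of about 114.6 daltons per residue, which is in the usual range for proteins.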

ini-1 has sequence similarity to SNF5

Comparison of the predicted sequence of ini-1 with the
known sequences in the GenEmbl data base revealed a
single significant match, the SNF5 protein of S.
cerevisiae, a transcriptional activator protein
(Abrams et al., 1986; Laurent et al., 1990;
Neigeborn and Carlson, 1984). SNF5 is a nuclear
protein thought to act in a complex with several other
proteins including SNF2/SWI2, SNF6, SWI1, and SWI3, to
activate target gene expression (Laurent et al., 1991;
Peterson and Herskowitz, 1992). The alignment of ini-1
with the SNF5 sequence displayed three regions of close
similarity, with 33-55% sequence identity and 41-71% of
conserved residues (Figures 5A and 5B; SEQ ID NO: 3-4).
All three regions lay in the central portion of the
SNF5 sequence rich in charged amino acids, and the
flanking N- and C-terminal portions of the yeast gene
were not conserved in the human gene. In particular,
the proline- and glutamine-rich segments of the yeast
protein were not retained. Based on the striking
similarity between the yeast and human genes in the
core coding region, ini-1 may be a human homologue
of the yeast SNF5 gene.
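
Figures such as percent identity and percent conserved residues are computed column by column over an alignment. A minimal Python sketch follows. The grouping of residues treated as conservative substitutions is an assumption made for illustration only; the scoring scheme actually used to derive the values above is not stated here, and the function name is hypothetical.

```python
def alignment_stats(a, b, similar_groups=("ILVM", "FYW", "KRH", "DE",
                                          "ST", "NQ", "AG")):
    """Return (percent_identity, percent_conserved) for two aligned
    sequences of equal length. Columns containing a gap ('-') in
    either sequence are ignored.

    `similar_groups` is an illustrative (hypothetical) definition of
    conservative substitutions; identical residues count as conserved.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    group_of = {res: i for i, grp in enumerate(similar_groups)
                for res in grp}
    compared = identical = conserved = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            identical += 1
            conserved += 1
        elif x in group_of and group_of[x] == group_of.get(y):
            conserved += 1
    if compared == 0:
        return (0.0, 0.0)
    return (100.0 * identical / compared, 100.0 * conserved / compared)
```

With the invented aligned pair "KLDE-A" and "RLEEGA", five columns are compared, three are identical and all five are conserved, giving 60% identity and 100% conservation under the assumed grouping.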

IN binds to ini-1 in vitro

To demonstrate that ini-1 interacts directly with IN in
solution, binding studies between recombinant proteins
were carried out in vitro. The ini-1 cDNA from plasmid
pD2.1 was inserted into plasmid pGEX and expressed as
a glutathione S-transferase fusion protein in E. coli.
Lysates of the bacteria were prepared, and the
GST-ini-1 protein was affinity purified on glutathione
agarose beads (G-beads). The beads were washed
extensively to remove nonspecific proteins. To ensure
that the GST-ini-1 proteins were successfully expressed
and bound to the beads, the proteins on the beads were
recovered by boiling in SDS, and examined by SDS-PAGE
(Figure 6A). A novel protein of the expected size
(60 kd) was recovered from lysates containing the
GST-ini-1 protein, and represented 70-80% of the total
protein.

The immobilized ini-1 was used as an affinity matrix
for binding of IN. The HIV-1 IN protein was expressed
in E. coli from the T7 promoter after induction of the
T7 polymerase, and soluble IN protein was extracted
from inclusion bodies with buffer containing high salt.
These lysates were then incubated with G-beads alone,
G-beads with GST alone, or G-beads with GST-ini-1. The
beads were washed extensively, and the bound proteins
were recovered with SDS. The eluted proteins were
separated by SDS-PAGE, blotted to nitrocellulose, and
visualized with polyclonal antibodies specific for
HIV-1 IN (Figure 6B). The results showed that the
recombinant IN bound efficiently to the ini-1 beads and
not to the control GST beads or the beads alone.

To further characterize the IN-ini-1 interaction,
binding experiments were repeated under various
conditions (Figures 6B and 6C). Binding was observed
over a wide range of salt concentrations, and was
detected even in the presence of 1 M NaCl. The IN was
retained by the ini-1 beads when washed with buffers
containing 0.5% NP40 or 0.1% Triton X-100. The
interaction was disrupted, however, by the addition of
0.1% SDS, suggesting that denatured IN and ini-1
proteins could not bind.

ini-1 acts as a transcriptional activator in yeast when
expressed as a DNA binding domain fusion protein

The yeast SNF5 protein is a transcriptional activator
required for the high-level expression of many genes
in yeast. Though the protein has not been shown to
bind to DNA directly, it is capable of activating a
reporter gene when artificially tethered to DNA by
fusion to the lexA DNA binding domain (Laurent et al.,
1990). To determine whether ini-1 could also act as a
transcriptional activator in this setting, a construct
encoding a fusion of GAL4 DNA binding domain-ini-1 was
generated and expressed in an indicator strain
containing a GAL1-lacZ reporter. The transformants
expressed high levels of β-galactosidase as judged by
staining with X-gal, while control transformants
expressing only the GAL4DB or GAL4AC-ini-1 protein did
not. Thus, like SNF5, the human ini-1 protein can
activate transcription in yeast.

The ini-1-IN interaction

The two-hybrid system has been used to seek human
proteins that might be involved in retroviral
replication. The novel gene identified in this screen,
ini-1, encodes a protein which is capable of binding
the HIV-1 IN both in vivo and in vitro. The fact that
all three clones recovered in the screen were
identical, and that no other clones were identified in
a large number tested, suggests that ini-1 is the major
human protein capable of binding to IN. It is
noteworthy that there were no false positive clones at
all detected in this screen, suggesting that the
GAL4DB-IN fusion used here did not allow interactions
with the GAL4 region or other proteins that often
produce false positives. The binding seemed to be very
specific, and could be observed in the setting of
several fusion constructs including either the GAL4 or
LexA binding domains. The interaction measured in
vitro was tight and was resistant to high salt,
suggesting that it may involve hydrophobic contacts on
the two partners. The binding in solution was also
specific, with no significant binding of IN to GST or
GST-cyclophilin proteins (Luban et al., 1993) tested as
controls.

The region of IN required for binding to ini-1 was a
portion of the central domain; the very N- and C-
terminal regions were dispensable. The essential
region for interaction with ini-1 was distinct from
that for multimerization of IN, apparently lying more
toward the N-terminus of the protein. Mutants of IN
that showed differential effects on the two
interactions were readily obtained. It is possible
that the ini-1 protein can bind to a multimer of IN and
stabilize multimer formation, or it could block or
compete for IN multimerization. ini-1 could stimulate
concerted joining of both termini into target DNAs,
accelerating functional integration reactions; or,
alternatively, it could inhibit concerted joining of
two termini of the viral DNA to the target sequence,
acting to restrain normal retroviral integration. The
function of ini-1 can be explored through analysis of
its effects on various in vitro integration activities.

Targeting retroviral integrations

The presence of a protein like ini-1 in an infected
cell able to bind the HIV-1 IN could be responsible for
targeting proviral insertion to selected sites in the
genome. The phenomenon of non-random integration of
retroviral and retrotransposon DNAs is well-
established, but the mechanisms by which targeting
occurs remain uncertain. Insertions seem to
preferentially occur into transcriptionally active
regions, and perhaps into open chromatin (Rohdewohld et
al., 1987; Vijaya et al., 1986). In the case of the
yeast transposon Ty3, site selection is profoundly
specific, with insertions almost always occurring at a
position 16 or 17 bp from the site of initiation of
polIII transcripts (Chalker and Sandmeyer, 1990;
Chalker and Sandmeyer, 1992). Analysis of mutant
promoter sequences and of hybrid target sites strongly
suggests that nuclear protein complexes including
TFIIIA, TFIIIB, and TFIID are responsible for site
selection, and for precise positioning of the insertion
into the promoter (Sandmeyer et al., 1990). In the
case of the transposon Ty1, site selection is more
relaxed, but analysis of a large number of insertions
into yeast chromosome III suggests that insertions tend
to occur within regions clustered within 400 bp of
polIII genes (Ji et al., 1993). Such preferences might
be mediated by the accessibility of stretches of DNA,
or by interactions of the transposon-IN complex with
chromatin or other DNA-bound proteins. The existence
of a mammalian protein with high affinity for the HIV-1
IN is consistent with its playing a similar role in
site selection for retroviral insertion.

Function of the ini-1 protein: reorganization of
chromatin structure

SNF5 is a transcriptional activator in yeast, and is
required for transcription of many unrelated genes such
as SUC2, HO, INO1, PHO5, and GAL1, 7 and 10. In
addition, it is required for the function of many gene-
specific activators, including GAL4, Bicoid, and the
glucocorticoid receptor (Laurent and Carlson, 1992;
Yoshinaga et al., 1992). Genetic experiments suggest
that the yeast SNF5 protein acts in an enormous complex
with the products of the SWI1, SNF2/SWI2, SWI3, SNF6,
and possibly other genes (Laurent et al., 1991;
Peterson and Herskowitz, 1992), and co-
immunoprecipitation studies using antibodies to yeast
SNF5 confirm its presence in a large complex. The SNF2
subunit of the complex has domains similar in sequence
to DNA helicases (Davis et al., 1992; Laurent et al.,
1992), and has been shown to exhibit DNA-dependent
ATPase activity (Laurent et al., 1993). Mammalian
homologues of the yeast SNF2/SWI2 products have
recently been identified (Khavari et al., 1993;
Muchardt and Yaniv, 1993; Okabe et al., 1992).

The SNF and SWI transcription factors may act by
helping to reorganize chromatin structure (for review,
see Winston and Carlson, 1992). Deletions of one copy
of the genes encoding histones H2A and H2B can suppress
the defects in Ty and SUC2 transcription caused by
snf2 and snf5 mutations (Clark-Adams et al., 1988;
Happel et al., 1991), and these suppressors probably
act by inducing changes in the chromatin structure as
assayed by micrococcal nuclease digestion experiments
(Hirschhorn et al., 1992). Other suppressors of snf
and swi mutations have been identified as alleles of a
gene encoding histone H3 (cited in Peterson and
Herskowitz, 1992), and of a gene encoding a nonhistone
DNA binding protein similar to HMG1 (Kruger and
Herskowitz, 1991). These observations suggest that the
normal role of the SNF and SWI genes may be to alter
the arrangement of nucleosomes on target genes to
facilitate their transcription. The unexpected
sequence similarity of the ini-1 protein to SNF5 is
intriguing: the similarity implies that ini-1 may be a
novel transcriptional activator in human cells, and may
act in a complex to decondense chromatin. The ability
of the human sequence to activate a reporter gene in
yeast when tethered to DNA lends further support to
this notion. Such a role is also consistent with its
affinity for HIV-1 IN, and would suggest that ini-1
might indeed account for the propensity of retroviral
DNA to insert into active genes.

Finally, the identification of a host protein
interacting with the HIV-1 IN raises the possibility
that it may be used as a novel route to inhibit viral
replication. If the protein serves to stimulate
integration, then drugs which could block the ini-1-IN
interaction might retard viral spread. In addition, it
might be possible to generate dominant negative alleles
of ini-1, perhaps encoding small fragments of the
protein, that bind inappropriately to IN and block its
activity.


The retroviral integrase protein (IN) is responsible
for the insertion of the viral DNA into host
chromosomal targets. The two hybrid system has been
used to screen a human cDNA library expressed as GAL4
fusion proteins in yeast for gene products that
interact with the human immunodeficiency virus type 1
IN. The screen led to the recovery of three
independent isolates of the same gene from
approximately 10^6 colonies. The protein encoded by
this gene bound tightly to the HIV-1 integrase in
vitro. The sequence of the gene suggests that the
novel protein is a human homologue of yeast SNF5, a
transcriptional activator required for high level
expression of many genes. The new gene, termed ini-1
for integrase interactor 1, encodes a nuclear receptor
for incoming viral integration complexes, and may be a
component of the long-sought mechanism for biased
target site selection during provirus integration.

Bacterial and yeast strains: Yeast strain GGY1::171
(MATa leu2-3,112 his3-200 met- tyr1 ura3-52 ade2 gal4Δ
gal80Δ URA3::GAL1-lacZ) (Fields and Song, 1989)
contains an integrated GAL1-lacZ reporter gene; CTY10-
5d (MATa ade2 trp1-901 leu2-3,112 his3-200 gal80-
URA3::LexA-LacZ) contains an integrated GAL1-lacZ gene
with lexA operator. β-galactosidase assays, both in
liquid cultures and on nitrocellulose lifts, were
carried out as published with minor modifications
(Chien et al., 1991). E. coli strains DH5α (BRL),
XL1-Blue and SURE (Stratagene) were used for subcloning
plasmids; strain BL21(DE3) was used for the expression
of recombinant proteins from T7 promoters.

Construction of various recombinant plasmids:

Construction of plasmids pMAI (encoding GAL4DB-IN
fusion), pGADI (encoding GAL4AC-IN), pSHIN (LexADB-
IN fusion) and pMA-MG (encoding the GAL4DB fused to the
Moloney MuLV Gag protein) have been previously
described (Kalpana and Goff, 1993; Luban et al., 1992).
Plasmids pGVKlO (expressing the GST-ini-1 fusion
protein) and pMAI (expressing GAL4DB-ini-1) were
constructed by transfer of the EcoRI cDNA fragment of
the interacting clone pD2.1 to the unique EcoRI sites
of pGEX-1λT and pMA424, respectively. Construction of
IN mutants pMAHH, pMACC, pMAAN3, pMAAC2 and pMa/iCS was
described earlier (Kalpana and Goff, 1993). The
remaining IN deletion mutants, pMAAlS to pMAA273, were
originally isolated as GAL4AC fusion mutants that
retained the ability to interact with GAL4DB-IN. The
BamHI-SalI fragments from these mutants were excised
from the GAL4AC plasmid and transferred into pMA424.
Isolation of pMAMS, encoding a mutant IN defective for
IN-IN interaction, will be described elsewhere.

Construction of HL60 cDNA library fused to the
activation domain of GAL4:

The HL60 cell cDNAs were excised from a λZap HL60 cDNA
library (Stratagene catalogue # 936214). The original
λZap library encompassed about a million recombinant
phage clones. To ensure that complexity of the
original library was retained, a plate lysate of the
phage library was prepared by plating 10'' phage; phage
particles were isolated by PEG precipitation and two
consecutive steps of CsCl gradient centrifugation. DNA
was isolated from the total phage by standard methods.
About 100 µg of DNA was digested with NotI and XhoI,
separated on agarose gels, and inserts 0.2-3.0 kb in
size were isolated by electroelution. The cDNA inserts
were ligated to the pGADNot vector (Luban et al., 1993)
digested with NotI plus SalI and phosphatase-treated.
DH5α cells were transformed with the ligation products
and the transformants from six individual batches of
100,000 to 500,000 colonies each were pooled separately
in LB/Amp (KGLI, pool I to pool VI). This unamplified
library in pGADNot vector was aliquoted into small
vials and stored frozen until further use. The ratio
of non-recombinants to recombinants in the library was
determined by comparing the number of transformants
obtained with self-ligated vector to that obtained with
vector ligated to insert, and by examining plasmids
from several individual colonies to determine the
presence of insert. Both these tests indicated that
there were >95% recombinants in the library. The
plasmid library DNA was isolated from 1-liter cultures
of each pool by Qiagen columns. This DNA was used for
transformation into yeast strain GGY1::171.
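
The first of the two tests for recombinant frequency described above is simple arithmetic: the colonies obtained with self-ligated vector estimate the non-recombinant background among the colonies obtained with vector plus insert. A Python sketch is given below. The function names and the colony counts in the usage note are hypothetical; actual plate counts are not reported here.

```python
def recombinant_fraction(colonies_vector_alone, colonies_vector_plus_insert):
    """Estimate the fraction of recombinant clones from two parallel
    transformations of equal scale.

    Assumption: colonies from the self-ligated vector control
    approximate the non-recombinant background in the insert ligation.
    """
    if colonies_vector_plus_insert <= 0:
        raise ValueError("need at least one colony in the insert ligation")
    background = colonies_vector_alone / colonies_vector_plus_insert
    return max(0.0, 1.0 - background)


def fraction_with_insert(colonies_with_insert, colonies_examined):
    """Fraction of individually examined colonies that carry an insert."""
    if colonies_examined <= 0:
        raise ValueError("no colonies examined")
    return colonies_with_insert / colonies_examined
```

With hypothetical counts of 40 colonies from self-ligated vector and 1,000 from vector plus insert, the estimate is 96% recombinants, consistent with a library judged to contain more than 95% recombinants.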

Transformation of yeast and screening for interacting
clones

Overnight cultures of GGY1::171 were diluted 1:50 or
1:100 in YPAD (YEPD supplemented with 30 µg/ml of
adenine) and incubated at 30°C until the OD600 reached
0.25-0.4. The cells were pelleted, washed once with
1/10th volume of 100 mM LiAc/10 mM TE, and resuspended
in 1/200th volume of the same buffer. The cells were
further incubated with shaking for 1 hour at 30°C. The
competent cells were incubated with 1 µg of plasmid
DNAs encoding GAL4DB and GAL4AC fusions, 20 µg of
sonicated salmon sperm carrier DNA (Sigma, catalogue #
D-9156) and 40% PEG in LiAc/TE with agitation at 30°C
for 30 minutes. After the PEG treatment, the cells
were pelleted and resuspended in 1 ml of YPAD and
incubated further for 1 hour at 30°C. The post-
incubation step increased the efficiency of co-
transformation by about 10 fold. Cells were pelleted,
resuspended in TE and plated on selective medium.
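
The volume steps in the procedure above scale with the culture volume, so they can be worked out with a small helper. The Python sketch below is only a convenience illustration; the function names are hypothetical, and a "1:50" dilution is assumed to mean one part overnight culture in fifty parts final volume.

```python
def inoculum_ml(final_culture_ml, dilution=50):
    """Volume of overnight culture needed for a 1:`dilution` dilution.

    Assumption: 1:50 means 1 part inoculum in 50 parts final volume.
    """
    return final_culture_ml / dilution


def transformation_volumes(culture_ml):
    """Wash and resuspension volumes for the LiAc/TE steps, taken as
    1/10th and 1/200th of the original culture volume."""
    return {
        "wash_ml": culture_ml / 10,
        "resuspend_ml": culture_ml / 200,
    }
```

For a 200 ml culture, for example, this gives 4 ml of overnight culture at 1:50, a 20 ml wash and a 1 ml final resuspension.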

In vitro binding of GST-ini-1 fusion protein to HIV-1
IN

Bacterial extracts containing GST-ini-1 fusion protein
were prepared as follows. Overnight bacterial cultures
containing the required plasmid were diluted 1:10 into
LB/Amp and incubated at 37°C until the OD600 was ~0.5.
IPTG (isopropyl-β-D-thiogalactopyranoside) was added to
a final concentration of 1 mM and incubation was
continued for 3-5 hours. The cells were collected and
resuspended in buffer Y (50 mM Tris/Cl pH 7.5, 50 mM
NaCl, 1 mM EDTA, 0.5% NP-40 and 1 mM PMSF). Lysozyme
was added to a final concentration of 1 mg/ml and
incubation was continued on ice for half an hour. This
lysate was subjected to sonication (3x30 sec bursts).
The lysate was clarified in a microfuge for 15 minutes,
and the supernatant was transferred to a fresh
microfuge tube. Pre-swollen G-beads were added to the
above lysate and incubated at 4°C for 30 minutes with
gentle rocking. The beads were spun at 1600 RPM in the
microfuge for 20 sec and the resulting pellet was
washed three times with excess of buffer Y and
resuspended in buffer Y to yield a 50% (v/v) slurry.

Bacterial extract containing HIV-1 IN was prepared as
follows. Overnight bacterial cultures of BL21(DE3)
containing either one of the plasmids, pT7fll.IN
(encodes IN under the control of T7 promoter) or pT7-
ΔIN (control plasmid from which IN is deleted), were
diluted 1:10 in LB/Amp and incubated at 37°C for 1
hour. IPTG was added to a concentration of 1 mM and
incubation was continued for 3-5 hours. The cultures
were pelleted and the pellets were resuspended in
buffer Y. Lysozyme was added to a final concentration
of 1 mg/ml and the cells were incubated on ice for 30
minutes. The lysed bacteria were sonicated and passed
through a syringe with a 23 gauge needle several times.
The insoluble material was collected by centrifugation
and resuspended in buffer containing 1 M NaCl and 1 mM
DTT, and the mixture was subjected to gentle rocking at
4°C for 30 minutes. The resulting solution was spun in
the microfuge for 30 minutes and the supernatant,
referred to as 1 M NaCl extract of IN, was used for the
binding assay with GST-ini-1.



The binding of IN to GST-ini-1 was tested by adding the
washed G-beads with bound GST-ini-1 to the 1 M NaCl
extract of IN and incubating for 30 minutes at 4°C in
buffer containing 1 M Hepes, pH 7.3, 200 mM NaCl, 5 mM
DTT, 0.1% NP-40, 1 mM PMSF and 10 mg/ml BSA. To test
the effect of salt, the concentration of NaCl was
varied in the binding buffer from 200 mM to 1 M. The
mixture was incubated at 4°C for 30 minutes and washed
three times with excess of either buffer Y or buffer Y
containing various concentrations of NP-40 and SDS.
The resulting beads were boiled in Laemmli buffer and
subjected to SDS-PAGE in duplicate. The presence or
absence of IN in these binding experiments was
determined by Western analysis using monoclonal
antisera to IN.

Screening the phage library to isolate full length
recombinant clones of ini-1

Two HeLa cDNA libraries, one constructed in λZapII
vector (Stratagene, Cat. #936201) and one constructed
in λgt11, were screened using standard methods. The
cDNA insert from one positive interacting clone
obtained in the yeast screen was purified, labelled by
random priming with 32P-dCTP, and used as a probe to
screen about 0.5 x 10^6 phage of the λZapII library.
DNA isolated from twenty positive clones obtained after
three rounds of plaque purification was subjected to
restriction analysis. Six positive clones that had the
largest inserts were chosen for further analysis. The
recombinant pBluescript phagemids from these six
positive λZapII clones were subjected to in vivo
excision using the M13 helper phage (ExAssist/SOLR
system, Stratagene, Cat. #200253).

mRNA analyses

Unfractionated mRNA prepared from HeLa, CB33 and Hut78
cell lines was subjected to Northern analysis using
standard methods. A Northern blot of human mRNAs from
multiple tissues (Clontech, Palo Alto CA; catalog
#7759-1) was hybridized to a labelled ini-1 probe using
standard methods (Maniatis et al., 1982).

Determination of whether Ini1 protein could affect IN
function:

The enzymatic activities of IN in the presence of
increasing concentrations of GST-Ini1F (full length
Ini1) fusion protein were assayed. A full length cDNA
clone missing only the first 5 codons was inserted into
pGEX2TK and the full length Ini1 fusion protein (GST-
Ini1F) was isolated using G-beads and eluted with 20 mM
glutathione. The protein was dialyzed against a large
volume of storage buffer (25 mM Hepes, pH 7.2, 50 mM
NaCl, 0.1 mM EDTA, 1 mM DTT, 20% glycerol, 1 mM PMSF,
1 µg/ml each of pepstatin, aprotinin and leupeptin) and
stored at -70°C. Recombinant HIV-1 IN protein was
isolated from bacterial cultures carrying plasmid
pINCSH essentially as described (Drelich et al., 1992)
with minor modifications. Integrase joining activity
assays were performed in a total volume of 30 µl, and
contained 1 ng of a double-stranded DNA oligonucleotide
from the HIV-1 U5 terminus, consisting of one strand
labelled at the 5' end, representing the already-
processed substrate (sequence 5'-GGATCCGGAAAATCTCTAGCA-
3'), and its unlabelled complement with extra CA
dinucleotide overhang at the 5' end; 10 ng of
pBluescript DNA as target; and ~15 ng of IN. The
reactions were stopped by addition of EDTA to 50 mM
final concentration. The products were treated with
proteinase K and SDS, and analyzed by electrophoresis
on a 1% agarose gel. The gel was dried and exposed to
autoradiography to monitor transfer of the
oligonucleotide to the relaxed circular and linear
target DNA.



To assay IN DNA joining activity, recombinant IN
protein was incubated with 32P-labeled double-stranded
DNA oligonucleotides corresponding to the U5 terminus
of the HIV-1 viral DNA as substrate, and with unlabeled
plasmid DNA as target. Aliquots of the reaction
mixture were removed at various times and analyzed by
agarose gel electrophoresis and autoradiography;
radioactivity migrating with the relaxed plasmid DNA
represented integration of the labeled oligonucleotide
into the target. The addition of increasing levels of
GST-Ini1F resulted in a strong and dose-dependent
stimulation of joining activity (Fig. 8A). Control
experiments with GST showed no such stimulation. In
some experiments, addition of very high levels of GST-
Ini1F resulted in inhibition.






To determine whether the native form of Ini1 as present
in mammalian cells behaved similarly, nuclear extracts
were prepared and the SNF/SWI complex (Ini1 extract)
was partially purified. The presence of Ini1 was
monitored by Western analysis with polyclonal antisera
raised against GST-Ini1. Ini1 cofractionated with Brg1
and the complex through five different conventional
columns, and was also retained on Brg1 immunoaffinity
columns. The addition of increasing amounts of this
preparation to the joining reactions resulted in potent
stimulation of IN activity (Fig. 8B). Partial
depletion of Ini1 by passage through a Brg1 affinity
column resulted in the removal of most of the
stimulatory activity (Fig. 8B). The
stimulation by the extract was constant over a wide
range of target DNA concentrations (Fig. 8C). Maximal
stimulation of IN activity (10-20 fold) occurred when
the IN:Ini1 molar ratio in the reaction mixture was
roughly 5:1. Higher concentrations of Ini1 resulted in
no further stimulation but rather slight inhibition.
Maximal stimulation by native Ini1 required lower
concentrations than with the recombinant Ini1.
Stimulation by Ini1 was strongly dependent on the
Ini1:IN ratio, with strongest stimulation at low IN
concentrations, no stimulation at intermediate
concentrations, and inhibition at high IN
concentrations (Fig. 8D).
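
The two quantities used in the paragraph above, the IN:Ini1 molar ratio and the fold stimulation of joining activity, can be made concrete with a short calculation. The Python sketch below is illustrative only and is not part of the original disclosure; the function names and the example amounts and signal values are hypothetical and are not data from the experiments described.

```python
def molar_ratio(in_pmol: float, ini1_pmol: float) -> float:
    """Return the IN:Ini1 molar ratio for one joining reaction."""
    if ini1_pmol <= 0:
        raise ValueError("Ini1 amount must be positive")
    return in_pmol / ini1_pmol


def fold_stimulation(signal_with_ini1: float, signal_without_ini1: float) -> float:
    """Return joining activity with Ini1 relative to IN alone."""
    if signal_without_ini1 <= 0:
        raise ValueError("baseline signal must be positive")
    return signal_with_ini1 / signal_without_ini1


# Hypothetical example values, chosen only to illustrate the arithmetic.
ratio = molar_ratio(in_pmol=10.0, ini1_pmol=2.0)
fold = fold_stimulation(signal_with_ini1=1500.0, signal_without_ini1=100.0)
print(f"IN:Ini1 = {ratio:.0f}:1, stimulation = {fold:.0f}-fold")
```

With these assumed inputs the ratio is 5:1 and the stimulation is 15-fold, which falls inside the 10-20 fold range reported for a ratio of roughly 5:1.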

A novel host protein, Ini1, can bind the HIV-1 IN
protein and stimulate its DNA joining activity. The
protein shows unexpected sequence similarity to the
SNF5 protein of yeast (Laurent et al., 1990), which is
required for the high-level transcription of many
genes, and for the proper functioning of several gene-
specific activators (Laurent et al., 1991; Laurent et
al., 1992; Yoshinaga et al., 1992; Peterson et al.,
1992; and Carlson et al., 1994). Genetic and
biochemical experiments suggest that SNF5 is part of a
very large complex of proteins able to promote
transcription both in vitro and in vivo (Laurent et
al., 1991; Laurent et al., 1992; Yoshinaga et al.,
1992; Peterson et al., 1992; Carlson et al., 1994;
Cairns et al., 1994; and Peterson et al., 1994). The
complex may help reorganize chromatin structure.
Mutations in snf2 and snf5 are suppressed by mutations
affecting histones H2A, H2B, and H3, as well as a
nonhistone DNA binding protein similar to HMG1, and
direct biochemical analysis suggests that the complex
can alter nuclease sensitivity of chromatin (Hirschhorn
et al., 1992; Kruger et al., 1991; and Winston et al.,
1992). The complex has been shown to alter chromatin
structure and promote binding of sequence-specific DNA
binding proteins (Kwon et al., 1994; and Imbalzano et
al., 1994). Ini1 is retained on an affinity column
containing anti-BRG1 antibodies, suggesting that it is
in a complex with BRG1. The sequence similarity, the
ability of Ini1 to activate a reporter gene when
tethered to DNA, and its presence in the mammalian
SWI/SNF complex strongly suggest that Ini1 is a
functional homolog of the yeast SNF5 gene.

The affinity of Ini1 for the HIV-1 IN might account for
the propensity of retroviral DNAs to insert into active
genes and their associated open chromatin (Vijaya et
al., 1986; Rohdewohld et al., 1987; Scherdin et
al., 1990; Shih et al., 1988; Withers-Ward et
al., 1994). Upon binding to Ini1, the preintegration
complex could be stimulated to insert the viral DNA
into nearby sites. Ini1 may provide a novel target for
antiviral therapy: virus replication might be blocked
by drugs that inhibit the IN-Ini1 interaction or by
dominant negative alleles of INI1 that bind
inappropriately to IN and block its activity.


(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1867 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N
(v) FRAGMENT TYPE: N-terminal
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 70..1225
(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GCCCCGGCCC CGCCCCAGCC CTCCTGATCC CTCGCAGCCC GGCTCCGGCC GCCCGCCTCT 60

GCCGCCGCA ATG ATG ATG ATG GCG CTG AGC AAG ACC TTC GGG CAG AAG 108
Met Met Met Met Ala Leu Ser Lys Thr Phe Gly Gln Lys
1 5 10

CCC GTG AAG TTC CAG CTG GAG GAC GAC GGC GAG TTC TAC ATG ATC GGC 156
Pro Val Lys Phe Gln Leu Glu Asp Asp Gly Glu Phe Tyr Met Ile Gly
15 20 25

TCC GAG GTG GGA AAC TAC CTC CGT ATG TTC CGA GGT TCT CTG TAC AAG 204
Ser Glu Val Gly Asn Tyr Leu Arg Met Phe Arg Gly Ser Leu Tyr Lys
30 35 40 45

AGA TAC CCC TCA CTC TGG AGG CGA CTA GCC ACT GTG GAA GAG AGG AAG 252
Arg Tyr Pro Ser Leu Trp Arg Arg Leu Ala Thr Val Glu Glu Arg Lys
50 55 60

AAA ATA GTT GCA TCG TCA CAT GGT AAA AAA ACA AAA CCT AAC ACT AAG 300
Lys Ile Val Ala Ser Ser His Gly Lys Lys Thr Lys Pro Asn Thr Lys
65 70 75

[illegible] GTC ACC CTG TTA AAA GCC 348
Asp His Gly Tyr Thr Thr Leu Ala Thr Ser Val Thr Leu Leu Lys Ala
80 85 90

[illegible] GAG AAG TAC AAG GCT 396
Ser Glu Val Glu Glu Ile Leu Asp Gly Asn Asp Glu Lys Tyr Lys Ala
95 100 105

[illegible] GAA CAG AAG GCC 444
Val Ser Ile Ser Thr Glu Pro Pro Thr Tyr Leu Arg Glu Gln Lys Ala
110 115 120 125

AAG AGG AAC AGC CAG TGG GTA CCC ACC CTG TCC AAC AGC TCC CAC CAC 492
Lys Arg Asn Ser Gln Trp Val Pro Thr Leu Ser Asn Ser Ser His His
130 135 140

[illegible] AGG AAC CGC ATG GGC 540
Leu Asp Ala Val Pro Cys Ser Thr Thr Ile Asn Arg Asn Arg Met Gly
145 150 155

CGA GAC AAG AAG AGA ACC TTC CCC CTT TGC TTT GAT GAC CAT GAC CCA 588
Arg Asp Lys Lys Arg Thr Phe Pro Leu Cys Phe Asp Asp His Asp Pro
160 165 170

GCT GTG ATC CAT GAG AAC GCA TCT CAG CCC GAG GTG CTG GTC CCC ATC 636
Ala Val Ile His Glu Asn Ala Ser Gln Pro Glu Val Leu Val Pro Ile
175 180 185

CGG CTG GAC ATG GAG ATC GAT GGG CAG AAG CTG CGA GAC GCC TTC ACC 684
Arg Leu Asp Met Glu Ile Asp Gly Gln Lys Leu Arg Asp Ala Phe Thr
190 195 200 205

TGG AAC ATG AAT GAG AAG TTG ATG ACG CCT GAG ATG TTT TCA GAA ATC 732
Trp Asn Met Asn Glu Lys Leu Met Thr Pro Glu Met Phe Ser Glu Ile
210 215 220

CTC TGT GAC GAT CTG GAT TTG AAC CCG CTG ACG TTT GTG CCA GCC ATC 780
Leu Cys Asp Asp Leu Asp Leu Asn Pro Leu Thr Phe Val Pro Ala Ile
225 230 235

GCC TCT GCC ATC AGA CAG CAG ATC GAG TCC TAC CCC ACG GAC AGC ATC 828
Ala Ser Ala Ile Arg Gln Gln Ile Glu Ser Tyr Pro Thr Asp Ser Ile
240 245 250

CTG GAG GAC CAG TCA GAC CAG CGC GTC ATC ATC AAG CTG AAC ATC CAT 876
Leu Glu Asp Gln Ser Asp Gln Arg Val Ile Ile Lys Leu Asn Ile His
255 260 265

GTG GGA AAC ATT TCC CTG GTG GAC CAG TTT GAG TGG GAC ATG TCA GAG 924
Val Gly Asn Ile Ser Leu Val Asp Gln Phe Glu Trp Asp Met Ser Glu
270 275 280 285

AAG GAG AAC TCA CCA GAG AAG TTT GCC CTG AAG CTG TGC TCG GAG CTG 972
Lys Glu Asn Ser Pro Glu Lys Phe Ala Leu Lys Leu Cys Ser Glu Leu
290 295 300

GGG TTG GGC GGG GAG TTT GTC ACC ACC ATC GCC TAC AGC ATC CGG GGA 1020
Gly Leu Gly Gly Glu Phe Val Thr Thr Ile Ala Tyr Ser Ile Arg Gly
305 310 315

CAG CTG AGC TGG CAT CAG AAG ACC TAC GCC TTC AGC GAG AAC CCT CTG 1068
Gln Leu Ser Trp His Gln Lys Thr Tyr Ala Phe Ser Glu Asn Pro Leu
320 325 330

CCC ACA GTG GAG ATT GCC ATC CGG AAC ACG GGC GAT GCG GAC CAG TGG 1116
Pro Thr Val Glu Ile Ala Ile Arg Asn Thr Gly Asp Ala Asp Gln Trp
335 340 345

TGC CCA CTG CTG GAG ACT CTG ACA GAC GCT GAG ATG GAG AAG AAG ATC 1164
Cys Pro Leu Leu Glu Thr Leu Thr Asp Ala Glu Met Glu Lys Lys Ile
350 355 360 365

CGC GAC CAG GAC AGG AAC ACG AGG CGG ATG AGG CGT CTT GCC AAC ACG 1212
Arg Asp Gln Asp Arg Asn Thr Arg Arg Met Arg Arg Leu Ala Asn Thr
370 375 380

GCC CCG GCC TGG T AACCAGCCCA TCAGCACACG GCTCCCACGG AGCATCTCAG 1265
Ala Pro Ala Trp
385



AAGATTGGGC CGCCTCTCCT CCATCTTCTG GCAAGGACAG AGGCGAGGGG ACAGCCCAGC 1325
GCCATCCTGA GGATCGGGTG GGGGTGGAGT GGGGGCTTCC AGGTGGCCCT TCCCGGTACA 1385
CATTCCATTT GTTGAGCCCC AGTCCTGCCC CCCACCCCAC CCTCCCTACC CCTCCCCAGT 1445
CTCTGGGGTC AGGAAGAAAC CTTATTTTAG GTTGTGTTTT GTTTTGTATA GGAGCCCCAG 1505
GCAGGGCTAG TAACAGTTTT TAAATAAAAG GCAACAGGTC ATGTTCAAAA AAAAAAAAAT 1565
TTCTTAAATC TAGTGTCTTT ATTTCTTCTG TTACAATAGT GTTGCTTGTG TAAGCAGGTT 1625
AGAGTGCACA GTGTCCCCAA TTGTTCCTGG CACTGCAAAA CCAAATTAAA CAATCCCACA 1685
AAGAATTCTG ACATCAATGT GTTTTCCTCA GTCAGST-CTA TTTCAAGATT CTAGAAGTTC 1745
CTTTTGTAAA ACTTGCCTTT AAAACTCTTC CTCCTAATGC CATCAGATCT CTTAACATTC 1805
GCTCACTGTG GGATCTTTCC TCTTAGGTTG AATTTCTACG TGAATATCAA AGTGCCTTTT 1865
TC 1867



(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 385 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Met Met Met Ala Leu Ser Lys Thr Phe Gly Gln Lys Pro Val Lys
1 5 10 15

Phe Gln Leu Glu Asp Asp Gly Glu Phe Tyr Met Ile Gly Ser Glu Val
20 25 30

Gly Asn Tyr Leu Arg Met Phe Arg Gly Ser Leu Tyr Lys Arg Tyr Pro
35 40 45

Ser Leu Trp Arg Arg Leu Ala Thr Val Glu Glu Arg Lys Lys Ile Val
50 55 60

Ala Ser Ser His Gly Lys Lys Thr Lys Pro Asn Thr Lys Asp His Gly
65 70 75 80

Tyr Thr Thr Leu Ala Thr Ser Val Thr Leu Leu Lys Ala Ser Glu Val
85 90 95

Glu Glu Ile Leu Asp Gly Asn Asp Glu Lys Tyr Lys Ala Val Ser Ile
100 105 110

Ser Thr Glu Pro Pro Thr Tyr Leu Arg Glu Gln Lys Ala Lys Arg Asn
115 120 125

Ser Gln Trp Val Pro Thr Leu Ser Asn Ser Ser His His Leu Asp Ala
130 135 140

Val Pro Cys Ser Thr Thr Ile Asn Arg Asn Arg Met Gly Arg Asp Lys
145 150 155 160

Lys Arg Thr Phe Pro Leu Cys Phe Asp Asp His Asp Pro Ala Val Ile
165 170 175

His Glu Asn Ala Ser Gln Pro Glu Val Leu Val Pro Ile Arg Leu Asp
180 185 190

Met Glu Ile Asp Gly Gln Lys Leu Arg Asp Ala Phe Thr Trp Asn Met
195 200 205

Asn Glu Lys Leu Met Thr Pro Glu Met Phe Ser Glu Ile Leu Cys Asp
210 215 220

Asp Leu Asp Leu Asn Pro Leu Thr Phe Val Pro Ala Ile Ala Ser Ala
225 230 235 240

Ile Arg Gln Gln Ile Glu Ser Tyr Pro Thr Asp Ser Ile Leu Glu Asp
245 250 255

Gln Ser Asp Gln Arg Val Ile Ile Lys Leu Asn Ile His Val Gly Asn
260 265 270

Ile Ser Leu Val Asp Gln Phe Glu Trp Asp Met Ser Glu Lys Glu Asn
275 280 285

Ser Pro Glu Lys Phe Ala Leu Lys Leu Cys Ser Glu Leu Gly Leu Gly
290 295 300

Gly Glu Phe Val Thr Thr Ile Ala Tyr Ser Ile Arg Gly Gln Leu Ser
305 310 315 320

Trp His Gln Lys Thr Tyr Ala Phe Ser Glu Asn Pro Leu Pro Thr Val
325 330 335

Glu Ile Ala Ile Arg Asn Thr Gly Asp Ala Asp Gln Trp Cys Pro Leu
340 345 350

Leu Glu Thr Leu Thr Asp Ala Glu Met Glu Lys Lys Ile Arg Asp Gln
355 360 365

Asp Arg Asn Thr Arg Arg Met Arg Arg Leu Ala Asn Thr Ala Pro Ala
370 375 380

Trp
385



(2) INFORMATION FOR SEQ ID NO:3: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 204 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: N

(iv) ANTI-SENSE: N

(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

Ala Ser Gln Pro Glu Val Leu Val Pro Ile Arg Leu Asp Met Glu Ile
1 5 10 15

Asp Gly Gln Lys Leu Arg Asp Ala Phe Thr Trp Asn Met Asn Glu Lys
20 25 30

Leu Met Thr Pro Glu Met Phe Ser Glu Ile Leu Cys Asp Asp Leu Asp
35 40 45

Leu Asn Pro Leu Thr Phe Val Pro Ala Ile Ala Ser Ala Ile Arg Gln
50 55 60

Gln Ile Glu Ser Tyr Pro Thr Asp Ser Ile Leu Glu Asp Gln Ser Asp
65 70 75 80

Gln Arg Val Ile Ile Lys Leu Asn Ile His Val Gly Asn Ile Ser Leu
85 90 95

Val Asp Gln Phe Glu Trp Asp Met Ser Glu Lys Glu Asn Ser Pro Glu
100 105 110

Lys Phe Ala Leu Lys Leu Cys Ser Glu Leu Gly Leu Gly Gly Glu Phe
115 120 125

Val Thr Thr Ile Ala Tyr Ser Ile Arg Gly Gln Leu Ser Trp His Gln
130 135 140

Lys Thr Tyr Ala Phe Ser Glu Asn Pro Leu Pro Thr Val Glu Ile Ala
145 150 155 160

Ile Arg Asn Thr Gly Asp Ala Asp Gln Trp Cys Pro Leu Leu Glu Thr
165 170 175

Leu Thr Asp Ala Glu Met Glu Lys Lys Ile Arg Asp Gln Asp Arg Asn
180 185 190

Thr Arg Arg Met Arg Arg Leu Ala Asn Thr Ala Pro
195 200



(2) INFORMATION FOR SEQ ID NO:4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 232 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: N

(iv) ANTI-SENSE: N

(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Asn Glu Thr Ser Glu Gln Leu Val Pro Ile Arg Leu Glu Phe Asp Gln
1 5 10 15

Asp Arg Asp Arg Phe Phe Leu Arg Asp Thr Leu Leu Trp Asn Lys Asn
20 25 30

Asp Lys Leu Ile Lys Ile Glu Asp Phe Val Asp Asp Met Leu Arg Asp
35 40 45

Tyr Arg Phe Glu Asp Ala Thr Arg Glu Gln His Ile Asp Thr Ile Cys
50 55 60

Gln Ser Ile Gln Glu Gln Ile Gln Glu Phe Gln Gly Asn Pro Tyr Ile
65 70 75 80

Glu Leu Asn Gln Asp Arg Leu Gly Gly Asp Asp Leu Arg Ile Arg Ile
85 90 95

Lys Leu Asp Ile Val Val Gly Gln Asn Gln Leu Ile Asp Gln Phe Glu
100 105 110

Trp Asp Ile Ser Asn Ser Asp Asn Cys Pro Glu Glu Phe Ala Glu Ser
115 120 125

Met Cys Gln Glu Leu Glu Leu Pro Gly Glu Phe Val Thr Ala Ile Ala
130 135 140

His Ser Ile Arg Glu Gln Val His Met Tyr His Lys Ser Leu Ala Leu
145 150 155 160

Leu Gly Tyr Asn Phe Asp Gly Ser Ala Ile Glu Asp Asp Asp Ile Arg
165 170 175

Ser Arg Met Leu Pro Thr Ile Thr Leu Asp Asp Val Tyr Arg Pro Ala
180 185 190

Ala Glu Ser Lys Ile Phe Thr Pro Asn Leu Leu Gln Ile Ser Ala Ala
195 200 205

Glu Leu Glu Arg Leu Asp Lys Asp Lys Asp Arg Asp Thr Arg Arg Lys
210 215 220

Arg Arg Gln Gly Arg Ser Asn Arg
225 230


What is claimed is:

1. An isolated nucleic acid encoding an integrase
interactor 1 gene (ini-1).

2. The isolated nucleic acid of claim 1, wherein the
nucleic acid is DNA encoding the integrase
interactor 1 gene that is free of one or more
introns present in genomic DNA.

3. The DNA of claim 1 labelled with a detectable
moiety selected from a group consisting of a
fluorescent label, a radioactive atom, and a
chemiluminescent label.

4. Isolated DNA of claim 1.

5. The isolated DNA of claim 4, wherein the DNA is
cDNA.

6. The isolated DNA of claim 4, wherein the DNA is
genomic DNA.

7. The isolated DNA of claim 2 labelled with a
detectable moiety selected from a group consisting
of a fluorescent label, a radioactive atom, and a
chemiluminescent label.

8. A replicable vector comprising the nucleic acid of
claim 1.

9. The replicable vector of claim 8, wherein the
nucleic acid is free of introns.

10. A plasmid of claim 8.

11. A plasmid of claim 9.

12. A host cell containing the vector of claim 8.

13. A host cell containing the vector of claim 9.

14. The host cell of claim 12, wherein the cell is a
eukaryotic cell.

15. The host cell of claim 12, wherein the cell is a
bacterial cell.

16. The host cell of claim 13, wherein the cell is a
eukaryotic cell.

17. The host cell of claim 13, wherein the cell is a
bacterial cell.

18. A purified polypeptide comprising naturally-
occurring ini-1.

19. The purified polypeptide of claim 18, wherein the
polypeptide is the product of prokaryotic or
eukaryotic expression of an exogenous DNA
sequence.

20. The purified polypeptide of claim 19, wherein the
exogenous DNA sequence is a cDNA sequence.

21. The purified polypeptide of claim 18, wherein the
ini-1 is human ini-1.

22. The purified polypeptide of claim 19, wherein the
exogenous DNA sequence is a genomic DNA sequence.

23. The purified polypeptide of claim 19, wherein the
exogenous DNA sequence is carried on an
autonomously replicating DNA plasmid or viral
vector.

24. The purified polypeptide of claim 18, possessing
part or all the amino acid sequence of human ini-1
as shown in Figure 4 or any naturally occurring
allelic variant thereof.

25. The purified polypeptide of claim 18 which has in
vivo biological activity of naturally occurring
ini-1.

26. The purified polypeptide of claim 18 which has in
vitro biological activity of naturally occurring
ini-1.

27. The purified polypeptide of claim 18 further
characterized by being covalently associated with
a detectable label substance.

28. A method of determining whether a compound is
capable of interfering with the formation of a
complex between a retrovirus integrase protein and
an ini-1 protein, which comprises:

    a) incubating the compound with an
    appropriate ini-1 affinity fusion
    protein and the retrovirus integrase
    protein;

    b) contacting the incubate of step (a) with
    an appropriate affinity medium under
    conditions so as to bind the ini-1
    affinity protein complex, if such a
    complex forms; and

    c) measuring the amount of the ini-1
    affinity protein complex formed in step
    (b) so as to determine whether the
    compound is capable of interfering with
    the formation of the complex between the
    retrovirus integrase protein and the
    ini-1 protein.

29. The method of claim 28, wherein the retrovirus
integrase protein is HIV-1 IN.

30. The method of claim 28, wherein the affinity
fusion protein is GST-ini-1.

31. The method of claim 28, wherein the affinity
medium is glutathione-agarose beads.

32. The method of claim 28, wherein the amount of the
ini-1 affinity protein complex formed is
determined using monoclonal antibodies.

33. The method of claim 28, wherein the amount of the
ini-1 affinity protein complex formed is
determined using polyclonal antibodies.

34. The method of claim 28, wherein the ini-1 affinity
protein complex is bound to the affinity medium.

35. The method of claim 34, wherein the ini-1 affinity
protein complex is purified and removed from the
affinity medium and the amount of integrase
protein is determined.

36. A method for determining whether a compound is
capable of interfering with the formation of a
complex between a retrovirus integrase protein and
an ini-1 protein, which comprises:

    a) incubating the compound with an
    appropriate retrovirus integrase
    affinity fusion protein and the ini-1
    protein;

    b) contacting the incubate of step (a) with
    an appropriate affinity medium under
    conditions so as to bind the retrovirus
    integrase affinity protein complex, if
    such a complex forms; and

    c) measuring the amount of the retrovirus
    integrase affinity protein complex
    formed in step (b) so as to determine
    whether the compound is capable of
    interfering with the formation of the
    complex between the retrovirus integrase
    protein and the ini-1 protein.

37. The method of claim 36, wherein the affinity
medium is glutathione-agarose beads.

38. The method of claim 36, wherein the amount of the
affinity protein complex formed is determined
using monoclonal antibodies.

39. The method of claim 36, wherein the amount of the
affinity protein complex formed is determined
using polyclonal antibodies.

40. The method of claim 36, wherein the retrovirus
integrase affinity protein complex is bound to the
affinity medium.

41. The method of claim 40, wherein the retrovirus
integrase affinity protein complex is purified and
removed from the affinity medium and an amount of
ini-1 protein is determined.

42. A method of disrupting a retrovirus life cycle in
a cell which comprises contacting the cell with a
compound which is capable of disrupting a
retrovirus integrase protein-ini-1 protein
interaction so as to thereby disrupt the
retrovirus life cycle.

43. The method of claim 41, wherein the compound
contacting the cell is a soluble ini-1 fragment.

44. The method of claim 41, wherein the compound
contacting the cell is a soluble HIV-1 IN
fragment.

45. The method of claim 41, wherein the compound
contacting the cell is a chemical molecule.

46. A method of disrupting a retrovirus life cycle in
a mammal which comprises administering to the
mammal a compound which is capable of disrupting
a retrovirus integrase protein-ini-1 protein
interaction so as to thereby disrupt the
retrovirus life cycle.

47. The method of claim 46, wherein the compound
administered to the mammal is a soluble ini-1
fragment.

48. The method of claim 46, wherein the compound
administered to the mammal is a soluble IN
fragment.

49. The method of claim 46, wherein the compound
administered to the mammal is a chemical molecule.